Generic Two-Qubit Photonic Gates Implemented by Number-Resolving Photodetection



Dmitry B. Uskov, A. Matthew Smith, and Lev Kaplan 
Department of Physics, Tulane University, New Orleans, Louisiana 70118, USA






We combine numerical optimization techniques [Uskov et al., Phys. Rev. A 79, 042326 (2009)] with symmetries of the Weyl chamber to obtain optimal implementations of generic linear-optical KLM-type two-qubit entangling gates. We find that while any two-qubit controlled-U gate, including CNOT and CS, can be implemented using only two ancilla resources with success probability $S > 0.05$, a generic $SU(4)$ operation requires three unentangled ancilla photons, with success $S > 0.0063$. Specifically, we obtain a maximal success probability close to 0.0072 for the B gate. We show that single-shot implementation of a generic $SU(4)$ gate offers more than an order of magnitude increase in the success probability and a two-fold reduction in overhead ancilla resources compared to standard triple-CNOT and double-B gate decompositions.



PACS numbers: 42.50.Ex, 03.67.Lx, 42.50.Dv 



Generation of and operation on quantum states of light at the single-photon level are important topics of research in the field of theoretical and experimental quantum information and metrology [3]. Due to the low rate of decoherence, photonic states are capable of carrying quantum information over large distances, enabling quantum state teleportation and distribution of entanglement across quantum networks. To build a universal all-optical quantum computer, one may couple photons through their interaction with atomic media, resulting either in a nonlinear but unitary interaction or in two-photon dissipative coupling and Zeno-type non-unitary evolution. However, nonlinear effects are vanishingly small for field intensities at the single-photon level, and the feasibility of these approaches is still unclear.

A more explicit and straightforward solution to the problem of photon coupling was suggested in a seminal work by Knill, Laflamme, and Milburn. In the KLM scheme, linear optical operations are performed on photons in computational and ancilla modes, followed by a measurement of the ancilla modes using number-resolving photocounting, as shown in Fig. 1.

FIG. 1: (Color online) A scheme for an LOQC transformation. The computational input state is a separable state of two or more dual-rail encoded qubits. The ancilla state may be a separable state, an entangled state, or an ebit state carrying spatially distributed entanglement.

Importantly, the bosonic statistics of photons, and the availability of experimental sources for generating photons indistinguishable in the spatial degrees of freedom, allow the quantum Hong-Ou-Mandel effect [8] to be exploited. The fidelity of such transformations strongly depends on mode mismatch (for experimental details of the errors involved, see for example Ref. [3]). However, recent progress in the technology of manufacturing microchips for optical interferometers, and the improvement of down-conversion sources of entangled photon pairs, inspires strong optimism that linear optical quantum computation (LOQC) will become a practical quantum technology in the near future.

An LOQC measurement-assisted transformation is schematically illustrated in Fig. 1. The total input state is $|\Psi^{(\mathrm{total})}\rangle = |\Psi^{(\mathrm{comp})}\rangle \otimes |\Psi^{(\mathrm{ancilla})}\rangle$, where $|\Psi^{(\mathrm{comp})}\rangle$ is a computational state in $N_c$ modes and $|\Psi^{(\mathrm{ancilla})}\rangle$ is an ancilla state in $N_a$ modes. In the Fock representation, this input state is determined by occupation numbers $n_1, \ldots, n_{N_c}$ and $n_{N_c+1}, \ldots, n_{N_c+N_a}$ for the computational modes and ancilla modes, respectively.

As indicated in Fig. 1, the linear optical device acts on the modes by a unitary transformation $a_i^\dagger \to \sum_{j=1}^{N} U_{ij}\, \tilde a_j^\dagger$, where $U$ is a unitary $N \times N$ matrix associated with the concrete optical device, $a_i^\dagger$ and $\tilde a_j^\dagger$ are creation operators of the input and output modes, respectively, and $N$ is the total number of modes (including vacuum modes, if any). The state $|\Psi^{(\mathrm{total})}\rangle$ is then transformed into an output state [Eq. (1)], each amplitude of which is a homogeneous polynomial function of the entries of the matrix $U$.
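As a hedged illustration of this step (a standard form written under the usual Fock-space conventions, not necessarily the authors' exact expression), substituting the mode transformation into a single Fock-basis component of the input gives the result below. The normalization factors $1/\sqrt{n_i!}$ and the vacuum symbol $|0\rangle$ are assumptions of that convention.

```latex
\prod_{i=1}^{N} \frac{(a_i^\dagger)^{n_i}}{\sqrt{n_i!}}\,|0\rangle
  \;\longrightarrow\;
  \prod_{i=1}^{N} \frac{1}{\sqrt{n_i!}}
  \Bigl(\sum_{j=1}^{N} U_{ij}\,\tilde a_j^\dagger\Bigr)^{n_i} |0\rangle
```

Every term on the right-hand side contains exactly $n_1 + \cdots + n_N$ entries of $U$, which is why each output amplitude is a homogeneous polynomial in those entries.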

Ideally, the photocounting measurement projects the output state onto some state $\langle k_{N_c+1}, k_{N_c+2}, \ldots, k_{N_c+N_a}|$ of the ancilla modes. The computational state (unnormalized) becomes

$$|\Psi^{(\mathrm{comp})}_{\mathrm{out}}\rangle = A\,|\Psi^{(\mathrm{comp})}\rangle, \qquad (2)$$

where $A = A(U)$ is a Kraus operator, $\|A\| \le 1$. All relevant properties of the LOQC transformation are determined by the matrix $A$, whose entries are again homogeneous polynomials in the elements of $U$ (specifically, they are permanents of matrices built from the entries of $U$ [10]).
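The permanent structure mentioned above can be made concrete with a small sketch. It assumes the mode map $a_i^\dagger \to \sum_j U_{ij} \tilde a_j^\dagger$ and the standard Fock normalization; the function names `permanent` and `fock_amplitude` are illustrative, and the brute-force sum over permutations is only practical for a handful of photons.

```python
import itertools
import math
import numpy as np

def permanent(M):
    """Permanent of a square matrix by direct summation over permutations."""
    n = M.shape[0]
    return sum(
        np.prod([M[i, p[i]] for i in range(n)])
        for p in itertools.permutations(range(n))
    )

def fock_amplitude(U, n_in, m_out):
    """Amplitude <m_out| Omega |n_in> for a_i^dag -> sum_j U[i, j] a_j^dag."""
    if sum(n_in) != sum(m_out):
        return 0.0  # linear optics conserves the total photon number
    # Repeat row i n_in[i] times and column j m_out[j] times.
    rows = [i for i, n in enumerate(n_in) for _ in range(n)]
    cols = [j for j, m in enumerate(m_out) for _ in range(m)]
    sub = U[np.ix_(rows, cols)]
    norm = math.sqrt(math.prod(math.factorial(k) for k in (*n_in, *m_out)))
    return permanent(sub) / norm

# Hong-Ou-Mandel check on a 50:50 beam splitter with one photon per input mode.
U_bs = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
amp_11 = fock_amplitude(U_bs, (1, 1), (1, 1))  # coincidence amplitude
amp_20 = fock_amplitude(U_bs, (1, 1), (2, 0))  # both photons in output mode 0
```

In this example the coincidence amplitude `amp_11` vanishes and `abs(amp_20)**2` equals 1/2, which is the Hong-Ou-Mandel effect. An entry of $A$ is obtained in the same way by fixing the ancilla input and the detected ancilla pattern and varying the computational basis states.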

Denoting the desired transformation matrix by $A^{(\mathrm{tar})}$, we can define the fidelity $F$ and operational success probability $S$ of the transformation via the Hilbert-Schmidt scalar product $\langle A|B\rangle = \mathrm{Tr}(A B^\dagger)/D_c$ in the $D_c$-dimensional computational Hilbert space as follows [13]:

$$F(U) = \frac{|\langle A|A^{(\mathrm{tar})}\rangle|^2}{\langle A|A\rangle\,\langle A^{(\mathrm{tar})}|A^{(\mathrm{tar})}\rangle}, \qquad S(U) = \langle A|A\rangle. \qquad (3)$$
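A minimal numerical sketch of these two definitions follows. The helper names are illustrative, and the example Kraus operator (a CS gate scaled by $\sqrt{2/27}$ with an arbitrary global phase) is chosen only to show that $F$ ignores overall scale and phase while $S$ records the scale.

```python
import numpy as np

def hs_inner(A, B):
    """Hilbert-Schmidt product <A|B> = Tr(A B^dagger) / D_c."""
    return np.trace(A @ B.conj().T) / A.shape[0]

def success(A):
    """Operational success probability S = <A|A>."""
    return hs_inner(A, A).real

def fidelity(A, A_tar):
    """Fidelity F = |<A|A_tar>|^2 / (<A|A> <A_tar|A_tar>)."""
    return abs(hs_inner(A, A_tar)) ** 2 / (success(A) * success(A_tar))

# Illustrative Kraus operator: CS gate, rescaled and with a global phase.
CS = np.diag([1, 1, 1, -1]).astype(complex)
A = np.exp(0.3j) * np.sqrt(2 / 27) * CS
F, S = fidelity(A, CS), success(A)
```

Here `F` evaluates to 1 and `S` to 2/27, the CNOT/CS success probability quoted later in the text.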

Since the success of the LOQC gate operation hinges on the outcome of a stochastic measurement process, the optimization of the success probability $S$ using minimal resources while maintaining perfect fidelity $F = 1$ is the key problem for practical implementation of LOQC. This optimization problem does not have a general algebraic solution due to the algebraic complexity of the fidelity and success functions. Even in the simplest case of a CNOT gate, the analytic solution was identified using numerical methods, but never proved analytically to be a global maximum. The only known example of an analytically optimizable LOQC gate is the Nonlinear Sign (NS) gate, where convexity of the success probability function allows for global optimization in the restricted case of an unentangled ancilla resource [15]. Even the problem of numerically calculating the coefficients of $A$ belongs to the #P-complete complexity class [16].

Recently, several successful implementations of analytical and numerical optimization have been reported in the literature for basic gates, including the CNOT gate [13], Toffoli gate [13], and Fredkin gate. One important result of these works was a demonstration that LOQC transformations allow one to bypass the standard circuit paradigm of quantum computing, in which the complete quantum calculation is constructed as a product of concatenated standard two-qubit gates. For example, the standard circuit scheme for implementing the Toffoli gate using six CNOT gates [19] results in a very
small success probability of $(2/27)^6$, whereas only three two-qubit gates are required in the analytic scheme [13], with a success probability of $(2/27)^2/2$ (here we assume a non-entangled ancilla resource and non-destructive heralded implementation of the gate). Further improvement
in the success probability is obtained by a single-shot numerical optimization technique [13], where the target gate is not decomposed into two-qubit gates but is instead implemented as a single LOQC transformation.

Universal two-qubit unitary gates constitute the core of current schemes for quantum information processing, since an arbitrary $SU(2^n)$ unitary operation can be implemented as a series of two-particle transformations. The success of the single-shot (block-optimization) technique applied to CNOT and Toffoli transformations inspired us to further investigate the problem of minimizing overhead resources and maximizing success probabilities for an arbitrary $SU(4)$ two-qubit gate.

Until now, the only two-qubit photonic gate systematically studied in the literature was the CS gate, equivalent to the CNOT gate via local Hadamard transformations, and consequently represented by the same point $\{\pi/2, 0, 0\}$ in the Weyl chamber (see below). Both gates belong to the class of two-qubit controlled-unitary gates C^U. The CNOT gate is indeed one of the most universal gates, since an arbitrary $SU(4)$ gate may be constructed using three CNOT gates. However, CNOT is less universal than the B gate, discovered recently by Zhang et al. [23]. An arbitrary $SU(4)$ transformation can be constructed as a product of only two B gates (plus local qubit rotations).

The success probability of the CNOT gate and the required photonic resources are well established [13]. CNOT requires two unentangled ancilla photons, and the maximal success probability is $2/27 \approx 0.074$. Surprisingly, adding ancilla resources does not affect the success probability. Consequently, if a generic two-qubit transformation is built from three CNOT gates, six unentangled ancilla photons will be required and the success probability of such a transformation will be rather small, $S = (2/27)^3$. In this work we optimize generic two-qubit gates directly [3], including the B gate as a special case. This will allow us to determine whether a combination of two B gates (or three CNOT gates) is more efficient than direct implementation of a generic two-qubit coupling gate.

Our method is based on conjugate gradient algorithms for maximizing the fidelity and success probability functions (3) in the space of unitary matrices $U$. Fidelity is a differentiable (but nonanalytic) function of the entries of $U$, defined by equation (3) even when the domain of matrices $U$ is extended by relaxing the unitarity requirement in favor of improving the efficiency of numerical optimization (unitarity is then easily recovered by a unitary dilation procedure if the resulting optimal matrix is found to be non-unitary). The success probability function $S$ is differentiable in the space of unitary matrices $U$, but exhibits singular behavior as a function of $U$ in the extended search space. The optimization starts by optimizing the fidelity $F$ until a point of perfect fidelity $F = 1$ is identified, and a penalty function approach is then used to optimize $S$ in the vicinity of the $F = 1$ subspace, finally leading to a local maximum of $S$ within the $F = 1$ subspace (values of $F$ obtained numerically are better than 0.999999). The process is repeated with multiple random starting points $U$ to obtain the best success rate for a given target gate.
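The text does not spell out the unitary dilation procedure. One standard construction, shown here as an assumption rather than as the authors' method, rescales the non-unitary matrix so that its largest singular value is at most 1 and then embeds it as the top-left block of a unitary of twice the size (the Halmos dilation). The function name `unitary_dilation` is illustrative.

```python
import numpy as np

def unitary_dilation(M):
    """Embed a square matrix M (rescaled to a contraction) in a 2n x 2n unitary.

    Returns (U_big, scale), where the top-left n x n block of U_big is M / scale.
    """
    W, s, Vh = np.linalg.svd(M)
    scale = max(1.0, s.max())          # rescale only if M is not a contraction
    M, s = M / scale, s / scale
    d = np.sqrt(np.clip(1.0 - s**2, 0.0, None))
    D_left = (W * d) @ W.conj().T      # sqrt(I - M M^dagger)
    D_right = (Vh.conj().T * d) @ Vh   # sqrt(I - M^dagger M)
    U_big = np.block([[M, D_left], [D_right, -M.conj().T]])
    return U_big, scale

# Example: a random non-unitary 3 x 3 matrix.
rng = np.random.default_rng(0)
M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
U_big, scale = unitary_dilation(M)
```

Because the entries of $A$ are homogeneous polynomials in the entries of the mode matrix, a uniform rescaling multiplies $A$ by a constant: the fidelity is unchanged and only the success probability is reduced. The extra modes introduced by this construction act as vacuum modes; a more economical dilation may exist for a given matrix.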

A generic two-qubit transformation $V \in SU(4)$ can be implemented as a product of local single-qubit pre- and post-rotations $V^{(1)}_{\mathrm{pre}}, V^{(2)}_{\mathrm{pre}}, V^{(1)}_{\mathrm{post}}, V^{(2)}_{\mathrm{post}} \in SU(2)$, and an entangling operation characterized by only three real parameters $\{c_1, c_2, c_3\}$ [21]. This is known as the Cartan KAK decomposition:

$$V = \left(V^{(1)}_{\mathrm{post}} \otimes V^{(2)}_{\mathrm{post}}\right) e^{\frac{i}{2}\left(c_1 \sigma_x \otimes \sigma_x + c_2 \sigma_y \otimes \sigma_y + c_3 \sigma_z \otimes \sigma_z\right)} \left(V^{(1)}_{\mathrm{pre}} \otimes V^{(2)}_{\mathrm{pre}}\right). \qquad (4)$$

Two gates are equivalent up to local rotations plus $\pi$-shifts if and only if the triplets $\{c_1, c_2, c_3\}$ and $\{c_1', c_2', c_3'\}$ can be transformed into each other by the action of the Weyl group. These transformations are listed explicitly in the first three rows of Table I, along with the local rotations generating them. The resulting equivalence class in the space of $\{c_1, c_2, c_3\}$ is known as the Weyl chamber: $0 \le c_3 \le c_2 \le c_1 \le \pi - c_2$. Half of the Weyl chamber ($c_1 \le \pi/2$) is shown in Fig. 2, with several important gates identified explicitly.
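The entangling core of the decomposition and the Weyl chamber condition can be sketched numerically. The code assumes the convention with a factor 1/2 in the exponent, which is the one that places CNOT at $\{\pi/2, 0, 0\}$ and makes a shift of any $c_k$ by $\pi$ a local operation; the names `canonical_gate` and `in_weyl_chamber` are illustrative.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def canonical_gate(c1, c2, c3):
    """exp[(i/2)(c1 XX + c2 YY + c3 ZZ)] via eigendecomposition of the exponent."""
    H = c1 * np.kron(X, X) + c2 * np.kron(Y, Y) + c3 * np.kron(Z, Z)
    w, P = np.linalg.eigh(H)
    return (P * np.exp(0.5j * w)) @ P.conj().T

def in_weyl_chamber(c1, c2, c3, tol=1e-12):
    """Check 0 <= c3 <= c2 <= c1 <= pi - c2."""
    return bool(-tol <= c3 <= c2 + tol and c2 <= c1 + tol
                and c1 <= np.pi - c2 + tol)

# A pi-shift of c1 only multiplies the gate by the local operator i * (X (x) X).
c = (0.7, 0.4, 0.1)
V = canonical_gate(*c)
V_shift = canonical_gate(c[0] + np.pi, c[1], c[2])
```

With these definitions `V_shift` equals `1j * np.kron(X, X) @ V`, and both the CNOT point $(\pi/2, 0, 0)$ and the B gate point $(\pi/2, \pi/4, 0)$ pass `in_weyl_chamber`.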



Weyl Symmetry                                                                 Local Rotation
$\{c_1, c_2, c_3\} \to \{c_1, \pm c_3, \pm c_2\}$                             $\exp\left(i\frac{\pi}{4}(\sigma_x^{1} \pm \sigma_x^{2})\right)$
$\{c_1, c_2, c_3\} \to \{\pm c_2, \pm c_1, c_3\}$                             $\exp\left(i\frac{\pi}{4}(\sigma_z^{1} \pm \sigma_z^{2})\right)$
$\{c_1, c_2, c_3\} \to \{\pm c_3, c_2, \pm c_1\}$                             $\exp\left(i\frac{\pi}{4}(\sigma_y^{1} \pm \sigma_y^{2})\right)$

Other Symmetry                                                                Transformation
$\{c_1, c_2, c_3\} \leftrightarrow \{\pi - c_1, c_2, c_3\}$                   Conjugation + Local
$\{c_1, c_2, c_3\} \leftrightarrow \{\pi/2 - c_3, \pi/2 - c_2, \pi/2 - c_1\}$ SWAP + Conj + Local

TABLE I: Transformations on $\{c_1, c_2, c_3\}$ that preserve the success probability $S$. The first three transformations are generated by local qubit rotations, as shown, and define the Weyl chamber. Two additional symmetries are specific to LOQC and allow the space of $\{c_1, c_2, c_3\}$ to be reduced to one quarter of the Weyl chamber.

FIG. 2: (Color online) Half of the Weyl chamber for the Cartan decomposition of the $SU(4)$ group.
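The three Weyl symmetries in Table I can be checked numerically by conjugating the entangling core with the listed local rotations. This is a sketch under the assumption that the core is $\exp[\frac{i}{2}(c_1 \sigma_x\sigma_x + c_2 \sigma_y\sigma_y + c_3 \sigma_z\sigma_z)]$ and that the two signs in each row are correlated; the helper names are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
PAULI = {
    "x": np.array([[0, 1], [1, 0]], dtype=complex),
    "y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def canonical_gate(c1, c2, c3):
    """exp[(i/2)(c1 XX + c2 YY + c3 ZZ)]."""
    H = sum(c * np.kron(PAULI[a], PAULI[a])
            for c, a in zip((c1, c2, c3), "xyz"))
    w, P = np.linalg.eigh(H)
    return (P * np.exp(0.5j * w)) @ P.conj().T

def local_rotation(axis, sign):
    """exp[i (pi/4) (sigma^1 + sign * sigma^2)] as a tensor product of rotations."""
    R = (I2 + 1j * PAULI[axis]) / np.sqrt(2)   # exp(i pi/4 sigma)
    return np.kron(R, R if sign > 0 else R.conj().T)

def conjugate(V, k):
    """Return k V k^dagger."""
    return k @ V @ k.conj().T

c1, c2, c3 = 0.7, 0.4, 0.1
V = canonical_gate(c1, c2, c3)
row1 = conjugate(V, local_rotation("x", +1))   # expected triplet (c1, c3, c2)
row2 = conjugate(V, local_rotation("z", +1))   # expected triplet (c2, c1, c3)
row3 = conjugate(V, local_rotation("y", +1))   # expected triplet (c3, c2, c1)
```

Choosing `sign = -1` in `local_rotation` reproduces the sign-flipped versions, for example the triplet $(c_1, -c_3, -c_2)$ for the x rotation.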

Two additional symmetries are present in the problem, allowing us to restrict our attention to one quarter of the Weyl chamber. First, suppose that a unitary transformation $U$ acting on the photon modes produces a Kraus operator $A(U)$ acting on the computational input state, and implements a two-qubit gate $V = e^{\frac{i}{2}(c_1 \sigma_x \otimes \sigma_x + c_2 \sigma_y \otimes \sigma_y + c_3 \sigma_z \otimes \sigma_z)}$ with fidelity $F(U)$ and success probability $S(U)$. From Eqs. (1) and (2), it is evident that $A(U^*) = A(U)^*$, and thus $U^*$ implements the gate $V^*$ with the same fidelity and success rate. Now the gate $V^*$ is associated with the triplet $\{-c_1, -c_2, -c_3\}$, which via local transformations and $\pi$-shifts maps to $\{\pi - c_1, c_2, c_3\}$ in the Weyl chamber. Thus, $\{c_1, c_2, c_3\}$ and $\{\pi - c_1, c_2, c_3\}$ have the same maximal success probability. This symmetry corresponds geometrically to a reflection through the CNOT-A2-SWAP plane, and permits us to consider only the left half of the Weyl chamber, $0 \le c_3 \le c_2 \le c_1 \le \pi/2$, which is shown in Fig. 2.

FIG. 3: Left panel: Success probability for the C^U gates along the O-CNOT line ($c_2 = c_3 = 0$), as a function of $c_1$. Right panel: Success probability for gates along the CNOT-B line ($c_1 = \pi/2$, $c_3 = 0$), as a function of $c_2$. Dotted curves indicate continuations of optimal families of solutions.

Secondly, we notice that the SWAP operation $\{\pi/2, \pi/2, \pi/2\}$ corresponds simply to a permutation of the photon modes, and may always be implemented with perfect fidelity and success. Thus the maximal success for $\{c_1, c_2, c_3\}$ is the same as the maximal success for $\{c_1 + \pi/2, c_2 + \pi/2, c_3 + \pi/2\}$, which by conjugation, local rotations, and $\pi$-shifts maps to $\{\pi/2 - c_3, \pi/2 - c_2, \pi/2 - c_1\}$ inside the half-chamber obtained previously. This last symmetry corresponds to reflection through the B-$\sqrt{\mathrm{SWAP}}$ line in Fig. 2 and allows us to focus on a quarter chamber defined, for example, by vertices O, CNOT, A2, and $\sqrt{\mathrm{SWAP}}$.
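Both LOQC-specific symmetries can be verified at the level of the two-qubit gates. The sketch below assumes the entangling core $\exp[\frac{i}{2}(c_1 \sigma_x\sigma_x + c_2 \sigma_y\sigma_y + c_3 \sigma_z\sigma_z)]$ and compares gates up to a global phase; the particular local operators used to connect the gates are one possible choice, and all helper names are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
SWAP = np.array([[1, 0, 0, 0], [0, 0, 1, 0],
                 [0, 1, 0, 0], [0, 0, 0, 1]], dtype=complex)

def canonical_gate(c1, c2, c3):
    """exp[(i/2)(c1 XX + c2 YY + c3 ZZ)]."""
    H = c1 * np.kron(X, X) + c2 * np.kron(Y, Y) + c3 * np.kron(Z, Z)
    w, P = np.linalg.eigh(H)
    return (P * np.exp(0.5j * w)) @ P.conj().T

def same_up_to_phase(A, B):
    """True if the unitaries A and B differ only by a global phase."""
    return bool(np.isclose(abs(np.trace(A.conj().T @ B)), A.shape[0]))

def swap_partner(c1, c2, c3):
    """Image of a triplet under the SWAP + conjugation symmetry."""
    return (np.pi / 2 - c3, np.pi / 2 - c2, np.pi / 2 - c1)

c = (0.7, 0.4, 0.1)
V = canonical_gate(*c)

# Conjugation symmetry: V* is locally equivalent to {pi - c1, c2, c3}.
conj_image = np.kron(I2, X) @ V.conj() @ np.kron(X, I2)

# SWAP symmetry: SWAP times V*, followed by a local y rotation on both qubits.
Ry = (I2 + 1j * Y) / np.sqrt(2)
k = np.kron(Ry, Ry)
swap_image = k @ SWAP @ V.conj() @ k.conj().T
```

Under these assumptions `conj_image` matches `canonical_gate(np.pi - c[0], c[1], c[2])` and `swap_image` matches `canonical_gate(*swap_partner(*c))`, both up to a phase. The B gate point $(\pi/2, \pi/4, 0)$ and the $\sqrt{\mathrm{SWAP}}$ point $(\pi/4, \pi/4, \pi/4)$ are fixed by `swap_partner`, consistent with the reflection line named in the text.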

Each point inside this region of the Weyl chamber represents a distinct gate with its own maximal success probability and minimal number of ancillary photons required to implement it. Gates along the O-CNOT edge are equivalent to the controlled unitary gates C^U. We find that all these gates require only two ancilla resources to attain perfect fidelity. The maximal success probability as a function of $c_1$ is shown in Fig. 3 (left panel). Interestingly, the optimal solution in each case takes the form previously observed by Knill for $c_1 = \pi/2$ (CNOT) and $c_1 = \pi/4$, i.e., the $6 \times 6$ matrix $U$ acts trivially on two of the computational modes.

The optimization process may be aided substantially by considering an $F = 1$ solution obtained for a given gate $\{c_1, c_2, c_3\}$ and using it as a starting point for optimization of nearby gates $\{c_1', c_2', c_3'\}$. This procedure results in a family of locally optimal solutions in a region of the Weyl chamber. Of course, the continuation of a globally optimal solution at $\{c_1, c_2, c_3\}$ is not guaranteed to remain globally optimal everywhere, and one must in general consider multiple such families. As seen in Fig. 3 (left panel), the optimal success curve for the O-CNOT line consists of only three distinct families of solutions.

Implementation    Success probability                  Ancilla photons
3 CNOT gates      $(2/27)^3 \approx 0.000406$          6
2 B gates         $> (0.00717)^2 \approx 0.00005$      6
Single shot       $> 0.0063$                           3

TABLE II: Success probabilities and needed resources for implementing general two-qubit gates, using CNOT decomposition, B decomposition, and single-shot design.
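The entries of Table II follow from multiplying the success probabilities of independent heralded gates. The short calculation below uses only the numbers quoted in the text (2/27 for CNOT, 0.00717 for the B gate, and the lower bound 0.0063 for the single-shot design); the variable names are illustrative.

```python
p_cnot = 2 / 27          # maximal CNOT success probability
p_b = 0.00717            # B gate success probability quoted in Table II
single_shot = 0.0063     # lower bound for a generic single-shot SU(4) gate

three_cnot = p_cnot ** 3     # three CNOT gates in sequence
two_b = p_b ** 2             # two B gates in sequence

gain_vs_cnot = single_shot / three_cnot
gain_vs_b = single_shot / two_b

print(f"3 CNOT: {three_cnot:.6f}, gain of single shot: {gain_vs_cnot:.1f}x")
print(f"2 B:    {two_b:.6f}, gain of single shot: {gain_vs_b:.1f}x")
```

The single-shot design comes out roughly 15 times more likely to succeed than the triple-CNOT decomposition and more than 100 times more likely than the double-B decomposition, while using three ancilla photons instead of six.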

Away from the O-CNOT line (or the A2-SWAP line, which is equivalent to it), all two-qubit gates require three ancillas to obtain $F = 1$ with $S > 0$. A systematic investigation of all non-equivalent gates in the Weyl chamber, on a cubic lattice with spacing $\pi/16$ in $\{c_1, c_2, c_3\}$ space, shows that all $SU(4)$ gates on this lattice may be implemented with perfect fidelity and a success probability not lower than 0.0063. Again, the solutions may be classified into a number of families. In the right panel of Fig. 3 we show the best success probability obtained for gates lying along the CNOT-B line (not including the CNOT gate itself, which requires fewer resources). As indicated in the figure, four families of solutions are found to be globally optimal at different points along this edge of the Weyl chamber. The B gate has success probability $\approx 0.0072$.
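To give a sense of the size of such a scan, the sketch below enumerates lattice points with spacing $\pi/16$ in the half chamber $0 \le c_3 \le c_2 \le c_1 \le \pi/2$ and then keeps one representative per pair related by $\{c_1, c_2, c_3\} \leftrightarrow \{\pi/2 - c_3, \pi/2 - c_2, \pi/2 - c_1\}$. Which representative is kept and how boundary points are treated are assumptions made here for illustration; the text does not state how the authors organized their lattice.

```python
from itertools import product

N_DIV = 16           # lattice spacing is pi / N_DIV
HALF = N_DIV // 2    # c1 <= pi/2 corresponds to index <= HALF

# Integer triples (i, j, k) stand for (c1, c2, c3) = (i, j, k) * pi / N_DIV.
half_chamber = [
    (i, j, k)
    for i, j, k in product(range(HALF + 1), repeat=3)
    if k <= j <= i
]

def swap_partner(p):
    """Index form of {c1, c2, c3} -> {pi/2 - c3, pi/2 - c2, pi/2 - c1}."""
    i, j, k = p
    return (HALF - k, HALF - j, HALF - i)

# One representative per orbit of the SWAP symmetry (hypothetical choice: the smaller tuple).
quarter_chamber = sorted({min(p, swap_partner(p)) for p in half_chamber})

print(len(half_chamber), len(quarter_chamber))
```

Each lattice point would then be passed to the optimization as a separate target gate, with solutions at neighboring points serving as starting guesses.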

Our main result is summarized in Table II, which shows the success probability and resources required to implement a generic two-qubit $SU(4)$ gate using three CNOT gates, two B gates, or single-shot design.

In conclusion, we use numerical optimization techniques [13] to find optimal implementations of generic linear-optical KLM-type two-qubit entangling gates, represented by generic points in the Weyl chamber of Khaneja's KAK decomposition of the $SU(4)$ group. Symmetries of the Weyl chamber are identified and used to aid the optimization process. A solution at one point in the Weyl chamber may be continuously deformed to obtain a family of locally optimal solutions; several such families are needed to obtain globally optimal solutions at all points in the Weyl chamber. We find that while any two-qubit controlled-U gate, including CNOT and CS, can be implemented using only two ancilla resources with success probability $S > 0.05$, a generic $SU(4)$ operation requires three unentangled ancilla photons. Our study indicates that single-shot implementation of a generic $SU(4)$ gate offers more than an order of magnitude increase in the success probability and a two-fold reduction in overhead ancilla resources compared to standard triple-CNOT and double-B gate decompositions. The B gate, which is the most efficient deterministic gate for decomposing an arbitrary $SU(4)$ transformation, has success probability close to 0.0072. In the context of probabilistic KLM-type transformations, this makes the B gate less efficient than the CNOT gate as a building block for arbitrary $SU(4)$ transformations. Our results are consistent with previous work on the Deutsch-Toffoli gate, where direct implementation of this three-qubit operation was shown to be four orders of magnitude more efficient than six-fold decomposition into CNOT gates [13, 17].